Abstract

We propose a model of random walks on weighted graphs in which the weights
are interval-valued, and connect it to reversible imprecise Markov chains.
While the theory of imprecise Markov chains is now well established, this is
a first attempt to model reversible chains. In contrast with the existing
theory, the probability models that have to be considered are now non-convex.
This presents a computational difficulty, since convexity is critical for the
efficient optimization algorithms used in the existing models. The second
part of the paper therefore addresses the computational issues of the model.
The goal is to find sets of weights that maximize or minimize expectations
corresponding to multi-step transition probabilities. In particular, we
present a local optimization algorithm and test its efficiency numerically.
We show that it allows close approximations of the globally best solutions to
be found in reasonable time.

1 Introduction

1.1 Modelling uncertainty in Markov chains and weighted graphs

Markov chains in which every sequence of states is equally likely whether the
process runs forwards or backwards are said to be reversible. Reversible
Markov chains are often interpreted and modelled as random walks on weighted
graphs, where the states of the chain are the vertices of the graph and the
transition probabilities are proportional to the weights of the edges incident
to the initial vertex. Reversible Markov chains are often used in Monte Carlo
methods. Random walks on graphs have become very popular in network analysis,
social networks and web recommender systems.

Modelling real-world phenomena with Markov chains requires estimating a large
number of parameters. Even with ever-growing amounts of data at our disposal,
this task is often impossible to achieve without serious uncertainty in the
estimates. Ignoring this fact and regarding the parameters as precise leads to
overprecise and unreliable results. The need for more robust probability
models has led to various approaches known under the common name of the theory
of imprecise probabilities. In particular, the theory of imprecise Markov
chains has been developed for the discrete as well as the continuous case
([26]). Most of the existing models are based on the theory of lower
previsions.

At the core of the theory of imprecise Markov chains is the idea that the
transition probabilities at each step are modelled with convex sets of
probability distributions rather than single transition probabilities.
Equivalently, all relevant probability distributions can then be modelled by
non-additive functionals called coherent lower previsions, which are defined
as lower envelopes of sets of additive functionals.

Weights in graphs often also reflect some relation between vertices obtained
on the basis of imperfect data. One way of expressing the resulting uncertainty
is to use intervals instead of precise weights. While this is a compelling
generalisation, the related optimisation problems seem to be hard in general.
So far, finding minimum spanning trees and shortest paths in graphs with
interval weights has received a lot of attention, while random walks seem not
to have been explored yet. The lack of appropriate models of imprecise Markov
chains and the apparently high complexity of the general model might be among
the reasons for this. The high complexity is also the main reason for our
decision to keep our model simple: we do not allow the weights to vary
completely freely within the interval bounds, but instead assume the sum of
the weights of the edges incident to a given vertex to be constant. This can
only be achieved efficiently by allowing loops, which then carry the
non-allocated weight mass.

1.2 Model 

The aim of the present article is to extend the theory of imprecise Markov
chains to the case of reversible chains; more specifically, to random walks on
weighted graphs with interval weights. Interval weights are interpreted as
sets containing the precise weights that actually determine the transition
probabilities. We also assume that the weights are not constant in time: at
every time step an unknown mechanism selects a new set of weights, about which
we have no information except that they belong to the given intervals. Once
the weights at a certain time step are selected, the transition probabilities
are calculated in the usual way. In our model we restrict the set of weights
by requiring that the total sum of the weights of the edges incident to a
vertex is constant and precisely known. This is achieved by assigning the
remaining mass to the loops (i.e. edges connecting a vertex to itself). This
restriction allows an efficient local optimisation for calculating bounds on
multi-step probabilities. A similar role is played by the rate of leaving a
state in continuous-time Markov chains. Having precisely given marginals
while the dependencies are imprecise is not uncommon, since there is usually
much more data available about marginal values than about dependencies.

In comparison with the existing models of imprecise Markov chains, the most
important differences are that the probability models behind our model are not
necessarily convex and that in general they do not satisfy Bellman's principle
of optimality (see e.g. [24]). Consequently, calculating bounds for multi-step
transition probabilities is a much more computationally intensive task.

We give a detailed description of the model in Sections 2 and 3.

1.3 Results 

While our theoretical model is not very different from other models of
imprecise Markov chains, there are substantial differences when it comes to
computations. We investigate the computation of n-step transition
probabilities, which are the basis for any analysis with Markov chains. As
imprecision is involved, we cannot speak about single precisely given
transition probabilities, but rather about their lower and upper bounds.
Moreover, in the case of imprecise probabilities, bounds for elementary events
are not sufficient to specify the corresponding probability models. Therefore
we have to consider computing bounds for more general expectations.

The existing models of imprecise Markov chains allow the transitions from each
state to be set independently of the transitions from the other states. This
ensures convexity of the underlying probability models and makes Bellman's
principle of optimality applicable. These properties imply the existence of a
single local, and therefore also global, optimum, which is found by
sequentially maximizing expectations via linear programming. The complexity of
the problem thus remains linear in the number of time steps. The problem of
finding extremal expectations in our setting is considerably more complicated.
In general the problem is not convex, nor does it satisfy Bellman's principle.
Consequently, multiple local optima exist in general, and backward induction
is not applicable. This means that the irreducible dimensionality of the
problem grows exponentially with the number of time steps.

Our main numerical result is a local optimisation algorithm, which we propose
in Section 4. Given an initial weight function, it returns a locally optimal
solution. Global extrema, though, are still sought by taking various starting
points and applying local optimisation to each. As the space of all feasible
points is far too big to be searched exhaustively, we cannot provide a
criterion that would guarantee that the obtained solution is a global optimum.
However, numerical testing shows that in most cases a reasonable approximation
of the global solution can be obtained with a moderate number of starting
points. Even more convincingly, it shows that if weight functions were chosen
at random, without applying local optimization, it would almost certainly take
incomparably larger samples to get results comparably close to the optimal
solution. Since, as far as we are aware, no other algorithms exist for the
optimization of random walks on graphs with interval weights, we can only
compare our method to random choice, which it outperforms by far.

2 Model settings 

Let $\mathcal{X}$ be a non-empty set of states. We will usually denote the
number of states by $s$. We consider random walks on the graph with vertex set
$\mathcal{X}$ and weighted edges given in the form of an interval weight
function. The probabilities of transitions between states are assumed to be
proportional to the weights. More precisely, if
$w\colon \mathcal{X}^2 \to \mathbb{R}_{\ge 0}$ is a weight function, then

$$p_w(x, y) = P_w(X_{n+1} = y \mid X_n = x) = \frac{w(x,y)}{\sum_{y' \in \mathcal{X}} w(x,y')} = \frac{w(x,y)}{W(x)}. \tag{1}$$

In this paper we assume that the denominator in the above fraction is a fixed
function of the state. That is, we assume a precisely given function
$W\colon \mathcal{X} \to \mathbb{R}_{+}$ giving the sum of the weights of the
edges incident to each vertex. This restriction will enable us to obtain an
efficient optimization algorithm. When modelling uncertainty we often have
fairly good information on the long-term distribution over the set of states
(which is closely related to the corresponding weights) but much less
certainty regarding the transition probabilities. We thus allow the weights
$w(x, y)$ with $x \ne y$ to vary freely within given intervals, and the
remaining weight mass is used to model the loop weight $w(x, x)$.
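
As a minimal sketch of the construction above (the function names and the
three-state numbers are illustrative assumptions, not taken from the paper),
the loop weights can be computed as the mass left over once the edge weights
are fixed, and the transition matrix then follows from equation (1):

```python
import numpy as np

def complete_with_loops(offdiag, W):
    """Return a symmetric weight matrix whose rows sum to W.

    offdiag: symmetric (s, s) array of edge weights; its diagonal is ignored.
    W: length-s array with the fixed marginal weights W(x).
    """
    w = np.array(offdiag, dtype=float)
    np.fill_diagonal(w, 0.0)
    if not np.allclose(w, w.T):
        raise ValueError("edge weights must be symmetric")
    loops = np.asarray(W, dtype=float) - w.sum(axis=1)
    if np.any(loops < -1e-12):
        raise ValueError("edge weights exceed the marginal weight W(x)")
    np.fill_diagonal(w, loops)          # loops carry the non-allocated mass
    return w

def transition_matrix(w, W):
    """Equation (1): p_w(x, y) = w(x, y) / W(x)."""
    return np.asarray(w, dtype=float) / np.asarray(W, dtype=float)[:, None]

# Hypothetical three-state example.
W = np.array([2.0, 1.5, 1.0])
edges = np.array([[0.0, 0.7, 0.3],
                  [0.7, 0.0, 0.5],
                  [0.3, 0.5, 0.0]])
w = complete_with_loops(edges, W)
P = transition_matrix(w, W)
pi = W / W.sum()                        # candidate stationary distribution
```

In this sketch every row of `P` sums to one, and `pi @ P` reproduces `pi`,
which reflects the role of the fixed marginals as the long-term distribution.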

Formally, we define the set

$$\mathcal{W} = \Big\{ w\colon \mathcal{X}^2 \to \mathbb{R}_{\ge 0} : \underline{w}(x,y) \le w(x,y) \le \overline{w}(x,y),\ \sum_{y \in \mathcal{X}} w(x,y) = W(x) \Big\},$$

where $\underline{w}$ and $\overline{w}$ are arbitrary such that

$$\underline{w} \le \overline{w} \quad\text{and}\quad \sum_{y \ne x} \overline{w}(x,y) \le W(x).$$

Every weight function in $\mathcal{W}$ defines transition probabilities via
equation (1). Our aim is to provide some basic properties of the corresponding
Markov chains.

Additionally, to avoid problems with uniqueness of the invariant
distributions, we will assume that for all pairs of states

$$\text{either } \overline{w}(x,y) = 0 \text{ or } \underline{w}(x,y) > 0 \tag{2}$$

and that there is a path between every pair of states consisting of edges with
strictly positive weights.

3 Imprecise Markov chains 

3.1 Transition operators with separately specified rows 

Markov chains whose parameters are only partially known have recently been
studied in some detail under the name imprecise Markov chains or Markov set
chains. Here we give the basic ideas and notation related to the theory
described in [5].

An imprecise Markov chain is a sequence $(X_n)_{n \in \mathbb{N} \cup \{0\}}$
of random variables taking values in a finite state space $\mathcal{X}$. The
imprecise distribution corresponding to $X_n$ is given in the form of a set of
probability distributions $\mathcal{M}_n$ consisting of the distributions
compatible with the given partial knowledge of the process. In the case where
the sets $\mathcal{M}_n$ are convex, they can be equivalently described in
terms of lower expectation functionals

$$\underline{E}_n(f) = \min_{q \in \mathcal{M}_n} E_q(f), \tag{3}$$

where $q\colon \mathcal{X} \to \mathbb{R}$ are the probability mass functions
corresponding to the distributions in $\mathcal{M}_n$ and $f$ is an arbitrary
real-valued map on $\mathcal{X}$.

The transition law between states is also given in an imprecise way, namely by
a set of transition operators $\mathcal{T}$, which is called an imprecise
transition operator. The following relation then holds:

$$\mathcal{M}_{n+1} = \mathcal{M}_n \mathcal{T} = \{ qT : q \in \mathcal{M}_n,\ T \in \mathcal{T} \}. \tag{4}$$

We do not assume that the transition probabilities are constant in time, but
only that they belong to the specified set of transition operators. Thus, an
imprecise Markov chain is in principle time-inhomogeneous, with fixed
constraints on the transition probabilities.

Moreover, an imprecise transition operator $\mathcal{T}$ maps every
$f \in \mathbb{R}^{\mathcal{X}}$ to the set
$\mathcal{T} f = \{ Tf : T \in \mathcal{T} \}$. An imprecise transition
operator $\mathcal{T}$ is said to have separately specified rows if for every
$T, T' \in \mathcal{T}$ and every row index $k$ there exists
$T'' \in \mathcal{T}$ such that $T''_{ij} = T_{ij}$ for every $j$ and every
$i \ne k$, and $T''_{kj} = T'_{kj}$ for every $j$. That is, the $k$th row can
be chosen independently of the choice of the other rows.

Now we have the following important property. If an imprecise transition
operator $\mathcal{T}$ has separately specified rows, then the set
$\mathcal{T} f$ has a minimal and a maximal element for every
$f \in \mathbb{R}^{\mathcal{X}}$, denoted by $\underline{T} f$ and
$\overline{T} f$. The mappings $\underline{T}\colon f \mapsto \underline{T} f$
and $\overline{T}\colon f \mapsto \overline{T} f$ are called the lower and the
upper transition operator respectively, and their values can be calculated via
linear programming. Let $\underline{E}_0$ be an initial lower expectation
operator and $\underline{T}$ a lower transition operator. The lower
expectation of $f(X_n)$, where $f \in \mathbb{R}^{\mathcal{X}}$, is calculated
by applying $\underline{T}$ repeatedly, $n$ times, and finally applying
$\underline{E}_0$, each time via linear programming:

$$\underline{E}_n(f) = \underline{E}_0(\underline{T}^n f). \tag{5}$$

The above equation generalizes the calculation of $n$-step transition
probabilities, since, for instance,

$$\underline{P}(X_n = y \mid X_0 = x) = \underline{E}_x(\underline{T}^n 1_{\{y\}}) = \underline{T}^n 1_{\{y\}}(x),$$

where $1_A$ denotes the indicator function of the set
$A \subseteq \mathcal{X}$. While in the case of precise Markov chains the
$n$-step transition probabilities between single states completely determine
the distribution of $X_n$, in the case of imprecise transitions the
expectations (5) must be used instead (see [5] or [25] for more details).
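
To illustrate the backward recursion (5), the following sketch assumes that
each row of the transition matrix is constrained only by interval bounds on
its entries. This is an assumption made here for illustration; the general
theory allows other convex row sets. For such bounds the row-wise linear
programme can be solved greedily, by assigning the free probability mass to
the smallest values of f first. All names and numbers are hypothetical.

```python
import numpy as np

def lower_row_expectation(lo, up, f):
    """Minimise sum_j p_j f_j subject to lo <= p <= up and sum(p) = 1."""
    lo, up, f = (np.asarray(a, dtype=float) for a in (lo, up, f))
    if lo.sum() > 1 + 1e-12 or up.sum() < 1 - 1e-12:
        raise ValueError("row bounds admit no probability vector")
    p = lo.copy()
    remaining = 1.0 - lo.sum()
    for j in np.argsort(f):             # cheapest outcomes receive mass first
        add = min(up[j] - lo[j], remaining)
        p[j] += add
        remaining -= add
    return float(p @ f)

def lower_transition(lo, up, f):
    """Lower transition operator: each row is optimised independently."""
    return np.array([lower_row_expectation(lo[i], up[i], f)
                     for i in range(len(f))])

def lower_expectation(lo, up, f, n):
    """Apply the lower transition operator n times (backward recursion)."""
    g = np.asarray(f, dtype=float)
    for _ in range(n):
        g = lower_transition(lo, up, g)
    return g

lo = np.array([[0.1, 0.2], [0.2, 0.1]])
up = np.array([[0.8, 0.9], [0.9, 0.8]])
f = np.array([0.0, 1.0])                # indicator of the second state
one = lower_expectation(lo, up, f, 1)
two = lower_expectation(lo, up, f, 2)
```

With these bounds the two-step lower probability of moving from the first to
the second state evaluates to 0.11. This is below the value 0.18 obtained in
Example 1 for the same bounds, because the rows are chosen independently here,
whereas the symmetric weights of Section 3.2 couple them.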


3.2 Transition operators on weighted interval graphs 

The transition operator with respect to a weight function $w$ is the map
$T_w\colon \mathbb{R}^{\mathcal{X}} \to \mathbb{R}^{\mathcal{X}}$ defined by

$$T_w f(x) = \sum_{y \in \mathcal{X}} p_w(x,y) f(y) = \sum_{y \in \mathcal{X}} \frac{w(x,y)}{W(x)} f(y). \tag{6}$$

Similarly, we can define the action of $T_w$ from the right by

$$q T_w(y) = \sum_{x \in \mathcal{X}} q(x) p_w(x,y) = \sum_{x \in \mathcal{X}} q(x) \frac{w(x,y)}{W(x)}. \tag{7}$$

In particular, if $q$ is the probability mass function corresponding to $X_n$
on the set of states, then $q T_w$ is the probability mass function
corresponding to $X_{n+1}$.

We keep our general assumption that the transition operators are an
(unspecified) function of time rather than constant in time. We now naturally
extend both operators to vectors of weight functions
$\mathbf{w} = (w_1, \ldots, w_n)$ with

$$T_{\mathbf{w}} f = T_{w_1} \cdots T_{w_n} f, \tag{8}$$

$$q T_{\mathbf{w}} = q T_{w_1} \cdots T_{w_n}. \tag{9}$$
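
The compositions (8) and (9) can be sketched as follows. The helper names are
assumptions of this illustration, and the two weight matrices are the extremal
weight functions of Example 1. The operator acting from the left is applied
starting with the last weight function, the one acting from the right starting
with the first, and both routes give the same pairing with q and f.

```python
import numpy as np

def transition(w, W):
    """Transition matrix with entries w(x, y) / W(x)."""
    return np.asarray(w, dtype=float) / np.asarray(W, dtype=float)[:, None]

def apply_left(weights, W, f):
    """T_{w_1} ... T_{w_n} f, equation (8): the last operator acts on f first."""
    g = np.asarray(f, dtype=float)
    for w in reversed(weights):
        g = transition(w, W) @ g
    return g

def apply_right(q, weights, W):
    """q T_{w_1} ... T_{w_n}, equation (9): the first operator acts on q first."""
    p = np.asarray(q, dtype=float)
    for w in weights:
        p = p @ transition(w, W)
    return p

W = np.array([1.0, 1.0])
w_a = np.array([[0.1, 0.9], [0.9, 0.1]])
w_b = np.array([[0.8, 0.2], [0.2, 0.8]])
q = np.array([1.0, 0.0])
f = np.array([0.0, 1.0])

left = float(q @ apply_left([w_a, w_b], W, f))     # <q, T_w f>
right = float(apply_right(q, [w_a, w_b], W) @ f)   # <q T_w, f>
```

Both `left` and `right` equal 0.74 here, the two-step probability of reaching
the second state when the first step uses `w_a` and the second uses `w_b`.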


Given a set of weights $\mathcal{W}$, we define the imprecise transition
operator $\mathcal{T} = \{ T_w : w \in \mathcal{W} \}$. It is clear that the
imprecise transition operator so defined does not possess the separately
specified rows property. In fact, there is no way to enforce such a property
because of the symmetry of the weights: the entry $w(x,y)$, which determines
the transition probability from $x$ to $y$, also determines the reverse
transition probability. Yet those two belong to different rows of the
corresponding transition matrix, namely the first to the row corresponding to
$x$ and the second to the row corresponding to $y$. As a consequence, the
notion of an upper or lower transition operator does not make sense here, as
the set $\mathcal{T} f := \{ T_w f : T_w \in \mathcal{T} \}$ has no unique
maximal or minimal element. Moreover, while sets of the form $\mathcal{T}$ or
$\mathcal{T} f$ are convex, this is no longer the case for more general sets,
such as $\mathcal{T}^2 := \{ T_{w_1} T_{w_2} : w_1, w_2 \in \mathcal{W} \}$ or
$\mathcal{T}^2 f$. This also means that the optimization methods based on
linear programming that are successfully applied in the theory of imprecise
Markov chains cannot be applied in our case.

An imprecise Markov chain is said to be regular if there exists a positive
integer $r$ such that all transition operators in $\mathcal{T}^n$, where
$n \ge r$, have all elements positive. According to our convention (2), every
state is reachable from any other state, and since loops are also possible
with strictly positive probability, the chain is aperiodic and therefore
regular. It then follows that there exists a unique invariant set
$\mathcal{M}$ of probability distributions. Assuming fixed marginals, it
follows that in our case the marginal distribution $\pi$, where
$\pi(x) = W(x) / \sum_{y \in \mathcal{X}} W(y)$, is the common unique
invariant distribution corresponding to all operators in $\mathcal{T}$. Thus
$\{\pi\} \mathcal{T} = \{\pi\}$, which implies that $\{\pi\}$ is the unique
invariant set of distributions.

3.3 Reversibility 

One of the most important properties of Markov chains that can be represented
as random walks on graphs is that they are reversible processes. That means
that a sequence of states is observed with the same probability as the same
sequence in reversed order:

$$P(X_1 = x_1, \ldots, X_n = x_n) = P(X_1 = x_n, \ldots, X_n = x_1), \tag{10}$$

assuming that $P(X_1 = x_1) = \pi(x_1)$, where $\pi$ is the unique invariant
distribution. Applying the above property to the case $n = 2$, we obtain the
detailed balance condition:

$$\pi(x) P(x, y) = \pi(y) P(y, x), \tag{11}$$

where $P(x,y)$ is the transition probability from $x$ to $y$.

Clearly, a precise random walk on a graph with weight function $w$ satisfies
the detailed balance condition, due to the fact that

$$P(X_1 = x, X_2 = y) = \frac{w(x,y)}{W} = P(X_1 = y, X_2 = x),$$

where $W = \sum_{x \in \mathcal{X}} W(x)$ denotes the sum of the weights
$w(x,y)$ over all ordered pairs of states. The symmetry of the interval
weights also clearly implies that the lower probabilities
$\underline{P}(X_1 = x, X_2 = y)$ and $\underline{P}(X_1 = y, X_2 = x)$ are
the same and equal to $\underline{w}(x,y)/W$.

Similarly, we can calculate the probabilities of elementary events for more
consecutive steps. Thus, for instance,

$$\underline{P}(X_1 = x_1, \ldots, X_n = x_n) = \frac{\prod_{i=1}^{n-1} \underline{w}(x_i, x_{i+1})}{W \prod_{i=2}^{n-1} W(x_i)}, \tag{12}$$

which again is equal to the lower probability of the reversed sequence of
states.
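
A small numerical check of reversibility for a precise weight function, under
the assumption that the chain starts in the stationary distribution
pi = W / sum(W). The weight matrix and the path are hypothetical; the closed
form has the same shape as equation (12), with the precise weights in place of
the lower ones.

```python
import numpy as np

def path_probability(path, w, W):
    """P(X_1 = x_1, ..., X_n = x_n) for a stationary start pi = W / sum(W)."""
    prob = W[path[0]] / W.sum()
    for x, y in zip(path, path[1:]):
        prob *= w[x, y] / W[x]          # one-step transition probability
    return float(prob)

W = np.array([2.0, 1.5, 1.0])
w = np.array([[1.0, 0.7, 0.3],
              [0.7, 0.3, 0.5],
              [0.3, 0.5, 0.2]])         # symmetric, rows sum to W
path = [0, 1, 2, 1]

forward = path_probability(path, w, W)
backward = path_probability(path[::-1], w, W)

# Product of edge weights over the total weight times the marginal weights
# of the interior states of the path.
closed_form = float(
    np.prod([w[x, y] for x, y in zip(path, path[1:])])
    / (W.sum() * np.prod([W[x] for x in path[1:-1]]))
)
```

The values `forward`, `backward` and `closed_form` coincide, because each
factor `W[x]` of the stationary start cancels against the denominator of the
first transition.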


4 Numerical calculations 


4.1 Calculating expectations bounds 


Although we have a very simple expression (12) for the lower probabilities of
sequences of states, it cannot be used directly to find the lower (or upper)
probabilities of more general events. An example is the calculation of the
2-step lower probability of a transition from $x$ to $y$. In the precise case,
knowing the probabilities of all chains of states of length 3 would allow
calculating this probability:

$$P(X_2 = y \mid X_0 = x) = \sum_{z \in \mathcal{X}} P(X_2 = y \mid X_1 = z) P(X_1 = z \mid X_0 = x).$$

However, the above formula is incorrect if the lower probability
$\underline{P}$ is taken instead of $P$. The reason is that lower
probabilities are in general non-additive. In our case, for instance, the
lower probability $\underline{P}(x,y) = P_w(x,y)$ for some particular weight
function $w$, but $\underline{P}(x,y') = P_{w'}(x,y')$ for another weight
function $w'$, and there is usually no weight function that would induce both
lower probabilities simultaneously.

What we need to find in the case of minimizing the 2-step transition
probabilities is

$$\underline{P}(X_2 = y \mid X_0 = x) = \min_{w_1, w_2 \in \mathcal{W}} \sum_{z \in \mathcal{X}} P_{w_1}(x,z) P_{w_2}(z,y). \tag{13}$$

A more general problem is to find the bounds for the expectation of some
function $f(X_n)$ given that $X_0 = x$, or given some probability distribution
of $X_0$ over the set of states. Although this seems an unnecessarily more
complex problem, we would in fact not gain much in terms of simplicity by
restricting ourselves to $n$-step transition probabilities alone.

A (precise) probability distribution over $\mathcal{X}$ can be described via a
probability mass function (pmf) $q\colon \mathcal{X} \to \mathbb{R}_{\ge 0}$,
where $q(x) = P(X_0 = x)$. The expectation of some
$f \in \mathbb{R}^{\mathcal{X}}$ with respect to $q$ is the scalar product

$$\langle q, f \rangle := E_q(f) = \sum_{x \in \mathcal{X}} q(x) f(x). \tag{14}$$

To extend the above formulation to the more general case of the expectation
after $n$ steps, we consider a vector of weight functions
$\mathbf{w} = (w_1, \ldots, w_n)$. Thus at the $k$th step the weight function
$w_k$ is assumed to induce the transitions. We then denote

$$\langle q, f \rangle^n_{\mathbf{w}} := \langle q, T_{\mathbf{w}} f \rangle. \tag{15}$$

Our goal is to find

$$\underline{\langle q, f \rangle}^n = \min_{\mathbf{w} \in \mathcal{W}^n} \langle q, f \rangle^n_{\mathbf{w}} \quad\text{and}\quad \overline{\langle q, f \rangle}^n = \max_{\mathbf{w} \in \mathcal{W}^n} \langle q, f \rangle^n_{\mathbf{w}}. \tag{16}$$

Since, clearly,
$\overline{\langle q, f \rangle}^n = -\underline{\langle q, -f \rangle}^n$, we
only need to consider the minimization version. Allowing arbitrary real-valued
functions $q$ instead of restricting to probability mass functions does not
change the complexity of the problem. Therefore we will from now on assume $q$
and $f$ to be arbitrary real-valued functions.

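Proposition 1. Let $f, q \in \mathbb{R}^{\mathcal{X}}$ and
$\mathbf{w} \in \mathcal{W}^n$. Then
$\langle q T_{\mathbf{w}}, f \rangle = \langle q, T_{\mathbf{w}} f \rangle$.

Proof. We proceed by induction on the length $n$ of the vector $\mathbf{w}$,
where the case $n = 0$ is trivial. By the induction hypothesis applied to
$(w_2, \ldots, w_n)$,

$$\langle q T_{\mathbf{w}}, f \rangle = \langle q T_{w_1}, T_{(w_2, \ldots, w_n)} f \rangle.$$

Denote $\tilde f = T_{(w_2, \ldots, w_n)} f$ and continue with

$$\begin{aligned}
\langle q T_{w_1}, \tilde f \rangle &= \sum_{y \in \mathcal{X}} q T_{w_1}(y) \tilde f(y) \\
&= \sum_{y \in \mathcal{X}} \sum_{x \in \mathcal{X}} q(x) \frac{w_1(x,y)}{W(x)} \tilde f(y) \\
&= \sum_{x \in \mathcal{X}} q(x) \sum_{y \in \mathcal{X}} \frac{w_1(x,y)}{W(x)} \tilde f(y) \\
&= \langle q, T_{w_1} \tilde f \rangle = \langle q, T_{\mathbf{w}} f \rangle.
\end{aligned}$$

□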


Clearly the following holds.

Proposition 2. The mapping $(q, w, f) \mapsto \langle q, f \rangle^1_w$ is
linear in all variables.

The first step towards finding the bounds (16) is to set $\mathbf{w} = (w)$,
that is, to take a single step, and find the weight function that makes the
expectation $\langle q, f \rangle^1_w$ extremal.

Proposition 3 (Optimality principle). Let
$q, f\colon \mathcal{X} \to \mathbb{R}$ and let $w$ be some weight function.
Set $h(x) = q(x)/W(x)$ for every $x \in \mathcal{X}$ and define

$$\psi_{h,f}(x,y) = (h(x) - h(y))(f(y) - f(x)).$$

Then $\langle q, f \rangle^1_w = \underline{\langle q, f \rangle}^1$ if

$$w(x,y) = \begin{cases} \underline{w}(x,y) & \text{if } \psi_{h,f}(x,y) > 0; \\ \overline{w}(x,y) & \text{if } \psi_{h,f}(x,y) < 0. \end{cases}$$

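Proof. We prove the proposition by contradiction. Suppose that $w$ attains the
minimum and that $\psi_{h,f}(x_0, y_0) > 0$ but
$w(x_0, y_0) > \underline{w}(x_0, y_0)$. Let
$0 < d < w(x_0, y_0) - \underline{w}(x_0, y_0)$ and set

$$w'(x,y) = \begin{cases} w(x,y) & \text{if } (x,y) \notin \{(x_0,y_0), (x_0,x_0), (y_0,y_0), (y_0,x_0)\}; \\ w(x,y) - d & \text{if } (x,y) \in \{(x_0,y_0), (y_0,x_0)\}; \\ w(x,y) + d & \text{if } (x,y) \in \{(x_0,x_0), (y_0,y_0)\}. \end{cases}$$

We have that

$$\begin{aligned}
\langle q, f \rangle^1_{w'} &= \sum_{x \in \mathcal{X}} q(x) \sum_{y \in \mathcal{X}} \frac{w'(x,y)}{W(x)} f(y) \\
&= \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{X}} w'(x,y) h(x) f(y) \\
&= \langle q, f \rangle^1_w + d \big( h(x_0) f(x_0) + h(y_0) f(y_0) - h(x_0) f(y_0) - h(y_0) f(x_0) \big) \\
&= \langle q, f \rangle^1_w - d\, \psi_{h,f}(x_0, y_0) \\
&< \langle q, f \rangle^1_w,
\end{aligned}$$

which contradicts the minimality of $\langle q, f \rangle^1_w$.

The case where $\psi_{h,f}(x_0, y_0) < 0$ is proved similarly. □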
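
The optimality principle translates into a few lines of code. The sketch below
is an illustration under stated assumptions: the function names are
hypothetical, the bounds are given as symmetric matrices whose diagonals are
ignored, and edges with psi equal to zero are set to the upper weight, which
is an arbitrary choice since such edges do not change the value. It relies on
the identity that the one-step value equals the sum of q(x) f(x) over all
states plus the sum of w(x, y) psi(x, y) over all unordered pairs of distinct
states, which follows by substituting the loop weights
w(x, x) = W(x) - sum of w(x, y) over y different from x.

```python
import numpy as np

def psi(q, f, W):
    """psi_{h,f}(x, y) = (h(x) - h(y)) * (f(y) - f(x)) with h = q / W."""
    h = q / W
    return (h[:, None] - h[None, :]) * (f[None, :] - f[:, None])

def minimizing_weight(q, f, W, lo, up):
    """One-step minimiser: lower weight where psi > 0, upper where psi < 0."""
    w = np.where(psi(q, f, W) > 0, lo, up).astype(float)
    np.fill_diagonal(w, 0.0)
    np.fill_diagonal(w, W - w.sum(axis=1))   # loops absorb the remaining mass
    return w

def one_step_value(q, w, W, f):
    """<q, f>_w^1 = <q, T_w f>."""
    return float(q @ ((w / W[:, None]) @ f))

# Bounds and q from Example 1; f is the vector T_w f computed there.
W = np.array([1.0, 1.0])
lo = np.array([[0.1, 0.2], [0.2, 0.1]])
up = np.array([[0.8, 0.9], [0.9, 0.8]])
q = np.array([1.0, 0.0])
w_min = minimizing_weight(q, np.array([0.9, 0.1]), W, lo, up)
```

Here psi is negative on the single edge, so `w_min` takes the upper edge
weight 0.9 and the loops receive the remaining mass 0.1.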

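Corollary 1. Let $q, f \in \mathbb{R}^{\mathcal{X}}$ be arbitrary mappings.
Then

$$\min_{\mathbf{w} \in \mathcal{W}^n} \langle q, f \rangle^n_{\mathbf{w}} \tag{17}$$

is attained at a vector $\mathbf{w} = (w_1, \ldots, w_n)$ in which all $w_i$
are extremal.

Proof. Let $\mathbf{w} = (w_1, \ldots, w_n)$ be a vector minimizing (17). We
will show that there then exists a vector $\mathbf{w}'$ with extremal
components such that
$\langle q, f \rangle^n_{\mathbf{w}'} \le \langle q, f \rangle^n_{\mathbf{w}}$.

Suppose that the $i$th component $w_i$ of $\mathbf{w}$ is not extremal. Set
$\tilde q = q T_{(w_1, \ldots, w_{i-1})}$ and
$\tilde f = T_{(w_{i+1}, \ldots, w_n)} f$. It easily follows from
Proposition 3 that the expression $\langle \tilde q, \tilde f \rangle^1_w$ is
minimized by an extremal weight function $\tilde w$. Thus
$\langle \tilde q, \tilde f \rangle^1_{\tilde w} \le \langle \tilde q, \tilde f \rangle^1_{w_i} = \langle q, f \rangle^n_{\mathbf{w}}$,
and replacing $w_i$ with $\tilde w$ in $\mathbf{w}$ still minimizes (17). By
repeating the same argument for all components with non-extremal weight
functions, we confirm the corollary. □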


4.2 The structure of the set of extremal weights 

The problem of calculating the extreme value and the optimal weight vector in
expression (16) is a high-dimensional optimization problem. We need to find
the optimal value of an $nm$-dimensional real-valued vector, where $n$ stands
for the number of time steps and $m$ for the number of edges of the graph.
Knowing that the extremal values are attained at extremal weight vectors
reduces the set of candidates to the set of all $nm$-dimensional binary
vectors. That gives $2^{nm}$ possibilities, which is in general still far more
than is tractable on any computer system. Moreover, it is not obvious how to
endow this space with a metric or norm structure that would help with the
optimization. Therefore the aim of the rest of this paper is to give a
heuristic approach to finding reasonable approximations of the target extreme
values, and insights into the practical complexity of the problem, which is
apparently too hard to be tackled exactly.

Yet there is an additional reduction of complexity, which follows from the
theoretical results in Section 4.1. Every component $w_i$ of an extremal
weight vector $\mathbf{w}$ must be of the form $w_\psi$, where $\psi$ is some
function of the form $\psi_{h,f}$. In fact, only the sign of $\psi_{h,f}$
matters. Clearly the sign of $\psi_{h,f}$ depends only on the orderings of the
states induced by the maps $h$ and $f$. In other words, every pair of
orderings of the states induces a feasible extremal weight function. That is,
for every component we need to consider at most $(s!)^2$ extremal weight
functions instead of all $2^m$ possible extremal weight functions, where $s$
denotes the number of states and $m$ the number of edges in the graph. The
number of feasible weights decreases further when taking into account the
orderings of $q$ and $f$, but still remains much too large to allow any kind
of exhaustive examination. Therefore a reasonable approach is to start with
some initial weight vector and try to improve it repeatedly until a (locally)
optimal solution is found. However, as we show below, this process does not
necessarily lead to a global optimum.
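
Since only the orderings induced by h and f matter, a starting point can be
drawn by sampling two random permutations per time step. The sketch below is
one possible way to do this; the function names, the bounds and the use of
NumPy's random generator are assumptions of this illustration.

```python
import numpy as np

def extremal_weight_from_orderings(rank_h, rank_f, W, lo, up):
    """Extremal weight function induced by two orderings of the states.

    rank_h, rank_f: permutations of 0..s-1 giving the rank of each state
    under h and f; only the signs of rank differences are used.
    lo, up: symmetric (s, s) bounds on the edge weights (diagonals ignored).
    """
    dh = rank_h[:, None] - rank_h[None, :]      # same sign as h(x) - h(y)
    df = rank_f[None, :] - rank_f[:, None]      # same sign as f(y) - f(x)
    w = np.where(dh * df > 0, lo, up).astype(float)
    np.fill_diagonal(w, 0.0)
    np.fill_diagonal(w, W - w.sum(axis=1))      # loops take the remaining mass
    return w

def random_extremal_vector(n, W, lo, up, rng):
    """Vector of n extremal weight functions from random pairs of orderings."""
    s = len(W)
    return [
        extremal_weight_from_orderings(
            rng.permutation(s), rng.permutation(s), W, lo, up)
        for _ in range(n)
    ]

rng = np.random.default_rng(0)
W = np.array([2.0, 1.5, 1.0])
up = np.array([[0.0, 0.7, 0.3], [0.7, 0.0, 0.5], [0.3, 0.5, 0.0]])
lo = 0.5 * up
ws = random_extremal_vector(4, W, lo, up, rng)
```

Each sampled matrix is symmetric, has row sums equal to `W`, and has every
edge weight at one of its two bounds.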

4.3 Finding local extrema 

Corollary 1 gives a necessary condition for $\mathbf{w}$ to give an extremal
value of $\langle q, f \rangle^n_{\mathbf{w}}$. However, there may be several
weight vectors satisfying this condition, yielding different values. That is,
we may have multiple local extrema due to the non-convex nature of the
problem. Let us illustrate this with an example.

Example 1. Let $\mathcal{X} = \{1, 2\}$,

$$\underline{w} = \begin{pmatrix} 0.1 & 0.2 \\ 0.2 & 0.1 \end{pmatrix}, \quad \overline{w} = \begin{pmatrix} 0.8 & 0.9 \\ 0.9 & 0.8 \end{pmatrix},$$

$q = (1, 0)$ and $f = (0, 1)$. The marginal weight function is then
$W = (1, 1)$. Our goal is to minimize $q T_{w_1} T_{w_2} f$, which in our case
corresponds to the lower two-step transition probability
$\underline{P}(X_2 = 2 \mid X_0 = 1)$.

Take

$$w = \begin{pmatrix} 0.1 & 0.9 \\ 0.9 & 0.1 \end{pmatrix}, \quad w' = \begin{pmatrix} 0.8 & 0.2 \\ 0.2 & 0.8 \end{pmatrix},$$

which are the only extremal weight functions. The transition operators
corresponding to $w$ and $w'$ coincide with $w$ and $w'$, because both
marginal weights are equal to 1.

Now we have that $f_2 := T_w f = (0.9, 0.1)$ and
$f_2' := T_{w'} f = (0.2, 0.8)$. Hence

$$\operatorname{sign}(\psi_{q, f_2}) = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \quad\text{and}\quad \operatorname{sign}(\psi_{q, f_2'}) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Hence the expression $q T_{w_1} f_2$ is minimized by taking $w_1 = w$, and
similarly the expression $q T_{w_1'} f_2'$ is minimized by taking
$w_1' = w'$.

To make things clearer, we parametrize all two-step weight vectors. Let

$$w_1(\alpha) = \alpha w + (1 - \alpha) w' \quad\text{and}\quad w_2(\beta) = \beta w + (1 - \beta) w'.$$

Now we can express explicitly

$$F(\alpha, \beta) = q T_{w_1(\alpha)} T_{w_2(\beta)} f = 0.32 + 0.42\alpha + 0.42\beta - 0.98\alpha\beta,$$

whose local minima on $[0,1]^2$ are $F(0,0) = 0.32$ and $F(1,1) = 0.18$, while
the local maxima at $(0,1)$ and $(1,0)$ attain the same value $0.74$. The only
stationary point in the interior of $[0,1]^2$ is a saddle point and thus not a
local extreme.
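
The values in Example 1 can be reproduced numerically. The sketch below uses
the fact that both marginal weights equal 1, so the transition operators
coincide with the weight matrices; the variable names are assumptions of this
illustration.

```python
import numpy as np

w_a = np.array([[0.1, 0.9], [0.9, 0.1]])   # extremal weight function w
w_b = np.array([[0.8, 0.2], [0.2, 0.8]])   # extremal weight function w'
q = np.array([1.0, 0.0])
f = np.array([0.0, 1.0])

def F(alpha, beta):
    """Two-step value q T_{w_1(alpha)} T_{w_2(beta)} f with W = (1, 1)."""
    w1 = alpha * w_a + (1 - alpha) * w_b
    w2 = beta * w_a + (1 - beta) * w_b
    return float(q @ w1 @ w2 @ f)

def F_closed(alpha, beta):
    """Closed-form expression stated in Example 1."""
    return 0.32 + 0.42 * alpha + 0.42 * beta - 0.98 * alpha * beta

corners = {(a, b): F(a, b) for a in (0, 1) for b in (0, 1)}
saddle = F(3 / 7, 3 / 7)   # both partial derivatives of F_closed vanish here
```

The corner values are 0.32, 0.74, 0.74 and 0.18, so the two minimizing corners
are not adjacent: moving from (0, 0) to (1, 1) by changing one weight function
at a time passes through a corner with value 0.74. The interior stationary
point lies at alpha = beta = 3/7, where F equals 0.5.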

Having characterized local extrema, we can now provide a simple procedure,
Algorithm 1, that finds a local extreme from some starting weight vector
$\mathbf{w}$.

Algorithm 1 FindLocalMinimum
 1: function LocalMinimum(q, initialWeightVector, f)
 2:     weightVector ← initialWeightVector
 3:     repeat
 4:         k ← setSplitPoint                  ▷ set the division point
 5:         wl ← weightVector(1 : k − 1)       ▷ left part of the weight vector
 6:         wr ← weightVector(k + 1 : n)       ▷ right part of the weight vector
 7:         ql ← q T_wl
 8:         fr ← T_wr f
 9:         wnew ← w_ψ with ψ = ψ_{ql/W, fr}   ▷ set the weight function w that minimizes ql T_w fr
10:         weightVector ← (wl, wnew, wr)
11:     until there are no more division points where ql T_{w_k} fr is not minimal
12: end function

We have left the way the split point is selected in line 4 of Algorithm 1 open
on purpose, because there are many ways to proceed. We can, for instance, go
from left to right and repeat the process until we find no more locally
non-optimal weight functions, or we can go the other way around. The order
does affect the results: not only may the number of iterations needed differ,
but the resulting locally optimal weight vector $\mathbf{w}$ may differ as
well.
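
The following is a minimal Python sketch of Algorithm 1 with one particular
choice for the split point, namely repeated left-to-right sweeps. It is an
illustration, not the implementation used for the experiments: the function
names, the tolerance `tol` and the sweep limit `max_sweeps` are assumptions.
A component is replaced only if the replacement strictly lowers the value.

```python
import numpy as np

def transition(w, W):
    return w / W[:, None]

def value(q, ws, W, f):
    """<q, T_{w_1} ... T_{w_n} f> for a list of weight matrices ws."""
    g = f
    for w in reversed(ws):
        g = transition(w, W) @ g
    return float(q @ g)

def best_single_weight(ql, fr, W, lo, up):
    """Weight function minimizing ql T_w fr (optimality principle)."""
    h = ql / W
    psi = (h[:, None] - h[None, :]) * (fr[None, :] - fr[:, None])
    w = np.where(psi > 0, lo, up).astype(float)
    np.fill_diagonal(w, 0.0)
    np.fill_diagonal(w, W - w.sum(axis=1))
    return w

def local_minimum(q, ws, W, f, lo, up, tol=1e-12, max_sweeps=1000):
    ws = [np.array(w, dtype=float) for w in ws]
    for _ in range(max_sweeps):
        improved = False
        for k in range(len(ws)):                 # left-to-right split points
            ql = q
            for w in ws[:k]:                     # ql = q T_wl
                ql = ql @ transition(w, W)
            fr = f
            for w in reversed(ws[k + 1:]):       # fr = T_wr f
                fr = transition(w, W) @ fr
            cand = best_single_weight(ql, fr, W, lo, up)
            old = ql @ transition(ws[k], W) @ fr
            if ql @ transition(cand, W) @ fr < old - tol:
                ws[k] = cand
                improved = True
        if not improved:
            break
    return ws, value(q, ws, W, f)

# Data of Example 1.
W = np.array([1.0, 1.0])
lo = np.array([[0.1, 0.2], [0.2, 0.1]])
up = np.array([[0.8, 0.9], [0.9, 0.8]])
q = np.array([1.0, 0.0])
f = np.array([0.0, 1.0])
w_a = np.array([[0.1, 0.9], [0.9, 0.1]])
w_b = np.array([[0.8, 0.2], [0.2, 0.8]])
results = {name: local_minimum(q, start, W, f, lo, up)[1]
           for name, start in {"ab": [w_a, w_b], "ba": [w_b, w_a]}.items()}
```

On the data of Example 1 the start `[w_a, w_b]` ends in the local minimum with
value 0.32, while the start `[w_b, w_a]` ends in the one with value 0.18, so
the outcome depends on the starting point.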


4.4 Testing the local algorithm 

It takes just a slight modification of Algorithm 1 to find local maxima
instead of minima. To understand its efficiency in finding global extrema, we
ran numerical tests with random interval weights on graphs of various sizes to
answer the following questions:

1. How many distinct local extrema does a typical problem of the form (16)
have?

2. How does the order of split points affect the resulting local extreme?

3. Can the value of $\langle q, f \rangle^n_{\mathbf{w}}$ at the initial
weight vector in any way predict the value at the resulting local extreme?

4. Does the value at a local extreme affect the likelihood of the algorithm
ending in that extreme?

                   time steps
  vertices       2        4        6
     4          1.9     13.4     80.6
     6          3.2     46.8    251.2
     8          5.3    100.3    411.4

Table 1: Average number of local extrema.


We tested the algorithm on graphs with 4, 6 and 8 vertices. The lower weights
$\underline{w}(x, y)$ were randomly generated from an exponential distribution
with expected value 0.8, and the upper bounds were generated as
$\overline{w}(x,y) = \underline{w}(x,y) + X$, where $X$ were random numbers
distributed exponentially with expected value 1. Entries of both $q$ and $f$
were generated as random numbers distributed exponentially with expected value
1.5. In each graph about 1/4 of the pairs of vertices were not connected.

For each set of parameters, i.e. $\underline{w}$, $\overline{w}$, $q$ and $f$,
a sample of 1500 extremal weight vectors was generated using random
permutations (see the last part of Section 4.2), and the corresponding local
extreme was then calculated using Algorithm 1. We tested random walks of
lengths 2, 4 and 6. For each graph size and random walk length a sample of 200
sets of parameters was generated.

The distribution of the number of (discovered) extreme points is typically
best modelled by an exponential distribution with a mean that depends on the
number of vertices and time steps. The means are listed in Table 1. We further
tested the influence of the order in which non-optimal weight functions are
selected and optimized in line 4 of Algorithm 1. We tested the left-to-right
and the right-to-left order. It turns out that most of the time the resulting
local extrema do not coincide. However, a comparison of the overall
frequencies of the obtained local extrema does not show any systematic
differences in their distributions. Figure 1 depicts the frequencies of local
minima and maxima for the left-to-right and right-to-left orders. One can
observe similar distributions for both orders.

One might expect that the starting weight function gives some information
about the local extreme it leads to. It could be, for instance, that a low
value at the starting point predicts a lower value at the resulting local
minimum, which could be used for a preliminary selection of starting points.
However, the empirical results show no significant correlation between the two
values. The dependence between the initial and the optimized value is shown in
Figure 2. It clearly shows that the initial value does not reveal much
information about the corresponding local maximum. The efficiency of the
algorithm, however, can be clearly observed by comparing the initial and
optimized values. Nor do the experimental results show any general pattern
suggesting that the value of a local extreme affects the probability that it
is attained by the local optimization algorithm.


4.5 From local to global extrema 

So far, the best way to find global extrema for random walks on weighted
graphs is to take a sample of random initial weight vectors, apply the local
optimization to each, and hope that one of the local extrema so obtained is a
global extreme. This method, however, does not provide any decisive criterion
for whether the best solution obtained is the globally best solution. It is
therefore not clear how big a sample one has to take in order to get an
estimate reasonably close to the true optimal solution with reasonable
certainty. In most cases, samples that are not too large suffice. We generated
a sample of 300 parameter sets for interval-weighted graphs with 8 vertices
and the same number of time steps. All $q$ and $f$ were non-negative to ensure
non-negativity of the resulting optima and therefore comparability of relative
deviations. Both the interval weights and the widths were generated as
exponentially distributed random values. For each parameter set a sample of
2500 random initial points was generated and then optimized to get the
corresponding local extrema. We calculated the values
$\langle q, f \rangle^n_{\mathbf{w}}$ for the initial extremal weight vector
$\mathbf{w}$ and for the optimized weight vector $\mathbf{w}_{\text{opt}}$. In
Figure 3 the average relative deviation from the best value found, as a
percentage of the best solution, is graphed for the initial and optimized
weight vectors depending on the size of the sample. In Figure 4 the worst-case
(maximal) relative deviation is graphed depending on the sample size. It can
be clearly observed that locally optimized solutions by far outperform the
best solutions that can be found using randomly generated extremal weights
only. Secondly, we can also observe that even though it cannot be guaranteed
that the best solution has been found among the locally optimized solutions,
the deviations become reasonable at sample sizes that are not too big.

Figure 1: Frequencies of left-to-right and right-to-left order in Algorithm 1
for lower and upper bounds. Panels: l-r: lower; l-r: upper; r-l: lower;
r-l: upper.

Figure 2: Comparison of the initial value of the maximizing function and the
value after applying the local optimization algorithm.

Figure 3: Average relative deviations from optimal solution for given sample
of initial points with and without local optimization.

Figure 4: Maximal relative deviations from optimal solution for given sample
of initial points with and without local optimization.

5 Conclusion and further work 

As far as we are aware, this paper is a first attempt to model random walks on
weighted graphs with interval weights, and also reversible imprecise Markov
chains. We have addressed the most basic question of calculating transition
probabilities for multiple time steps. This is a non-convex problem whose
computational complexity grows exponentially with the number of time steps.
Thanks to the local optimization algorithm, we can find reasonable
approximations of global optima with a tractable amount of computation.

Our approach only works in the case where the marginals are known precisely.
It is therefore natural to try to extend the calculations to the case without
the assumption of a fixed sum of weights, and even to allow more general sets
of weight functions besides those described in terms of intervals. While both
extensions would be plausible, computational complexity may be an obstacle to
their efficient analysis.

Regarding the long-term distribution, our model is a simple one with a known
limit distribution. There are, however, still several relevant questions to be
addressed, such as computing mixing times, cover times and hitting times.